Factorization and Inversion of a Million Matrices using GPUs: Challenges and Countermeasures
نویسندگان
چکیده
This paper presents new algorithmic approaches and optimization techniques for LU factorization and matrix inversion of millions of very small matrices using GPUs. These problems appear in many scientific applications including astrophysics and generation of block-Jacobi preconditioners. We show that, for very small problem sizes, design and optimization of GPU kernels require a mindset different from the one usually used when designing LAPACK algorithms for GPUs. Techniques for optimal memory traffic, register blocking, and tunable concurrency are incorporated in our proposed design. We also take advantage of the small matrix sizes to eliminate the intermediate row interchanges in both the factorization and inversion kernels. The proposed GPU kernels achieve performance speedups vs. CUBLAS of up to 6× for the factorization, and 14× for the inversion, using double precision arithmetic on a Pascal P100 GPU.
منابع مشابه
On the WZ Factorization of the Real and Integer Matrices
The textit{QIF} (Quadrant Interlocking Factorization) method of Evans and Hatzopoulos solves linear equation systems using textit{WZ} factorization. The WZ factorization can be faster than the textit{LU} factorization because, it performs the simultaneous evaluation of two columns or two rows. Here, we present a method for computing the real and integer textit{WZ} and textit{ZW} factoriz...
متن کاملTHE USE OF SEMI INHERITED LU FACTORIZATION OF MATRICES IN INTERPOLATION OF DATA
The polynomial interpolation in one dimensional space R is an important method to approximate the functions. The Lagrange and Newton methods are two well known types of interpolations. In this work, we describe the semi inherited interpolation for approximating the values of a function. In this case, the interpolation matrix has the semi inherited LU factorization.
متن کاملA Modified Digital Image Watermarking Scheme Based on Nonnegative Matrix Factorization
This paper presents a modified digital image watermarking method based on nonnegative matrix factorization. Firstly, host image is factorized to the product of three nonnegative matrices. Then, the centric matrix is transferred to discrete cosine transform domain. Watermark is embedded in low frequency band of this matrix and next, the reverse of the transform is computed. Finally, watermarked ...
متن کاملInvestigating the Effects of Hardware Parameters on Power Consumptions in SPMV Algorithms on Graphics Processing Units (GPUs)
Although Sparse matrix-vector multiplication (SPMVs) algorithms are simple, they include important parts of Linear Algebra algorithms in Mathematics and Physics areas. As these algorithms can be run in parallel, Graphics Processing Units (GPUs) has been considered as one of the best candidates to run these algorithms. In the recent years, power consumption has been considered as one of the metr...
متن کاملIEC 60870-5-104 Protocol Security Challenges and Countermeasures Identification
Industrial control systems (ICSs) which are used in critical infrastructure and other industries mostly use various communication protocols. Most of these communication protocols have various cyber security challenges and weakness that give the attackers the opportunity to gain to their malicious intentions. In this paper, we assess IEC 60870-5-104 protocols from security perspective which is u...
متن کامل